- IT*- p|<?^W^ pcbA/41 

■5' CTO TC& CTG-TCC CCC TTC_ A& 3 



Attachment B 



short read

long read




mix 1 - 



CTGCTCCGGCCACTGCCTGAGACTCACc™ 

, gcctctgtcgcttctgtcgctgScccctS 

#Acfe G GCGGCANNCCAGGG TTNAGTCC C T GAGCCCCGCGAGCCCGGGCCGCACACGC 



0SA3 . 1 

short read



long read



cgtctacttcttcgaSS^ 

mix T7 and 0SA3 . 1 V6^rv 

S G J A S CGAGCTCGATCCACT AGTAACGGCCGCCAGTGT(iTCTAAA(±r^ 
JGCAGGGCTACTGCTGCTCCGGCCACTGCCTGAG^ 

AGTCCCTGAGCCCCGCGAGCCC^G^CGCACACGCGACkTGGGrrrrlrS^ 
A rr^^ GGGTGGCGGTGGGG ^ GCGTGG ^^ 

Sc^aSS 



BLASTX 1.3.9MP 



[Build 



Reference: Gish, Warren and David J. States (1993). Identification of protein 
coding regions by database similarity search. Nature Genetics 3:266-72. 
Altschul, Stephen F., Warren Gish, Webb Miller, Eugene W. Myers, and David J. 
Lipman (1990). Basic local alignment search tool. J. Mol. Biol. 215:403-410.

Notice: statistical significance is estimated under the assumption that the
equivalent of one reading frame in the query sequence codes for protein and
that significant alignments will involve only coding reading frames.


Query= TITLE phasr3.seq
(447 letters) 

Translating both strands of query sequence in all 6 reading frames 
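
The six-frame translation step named above can be illustrated with a minimal Python sketch. It is not the BLASTX implementation: it assumes the standard genetic code, an uppercase or lowercase DNA string, and the hypothetical helper names `reverse_complement`, `translate` and `six_frames`. Codons containing an ambiguous base such as N are rendered as X.

```python
# Standard genetic code, with bases ordered T, C, A, G at each codon position.
BASES = "TCAG"
AMINO = ("FFLLSSSSYY**CC*W"
         "LLLLPPPPHHQQRRRR"
         "IIIMTTTTNNKKSSRR"
         "VVVVAAAADDEEGGGG")
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def reverse_complement(dna):
    """Return the reverse complement of an uppercase DNA string."""
    return dna.translate(str.maketrans("ACGTN", "TGCAN"))[::-1]


def translate(dna):
    """Translate complete codons only; unknown codons become X."""
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X")
        for i in range(0, len(dna) - 2, 3)
    )


def six_frames(dna):
    """Map frame numbers +1..+3 and -1..-3 to their translations."""
    dna = dna.upper()
    rc = reverse_complement(dna)
    frames = {}
    for offset in range(3):
        frames[offset + 1] = translate(dna[offset:])
        frames[-(offset + 1)] = translate(rc[offset:])
    return frames


if __name__ == "__main__":
    for frame, protein in sorted(six_frames("ATGGCCTAA").items()):
        print(frame, protein)
```

Frames -1 to -3 are read from the reverse complement, so frame -1 begins at the last base of the query.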

Database: Non-redundant PDB+SwissProt+PIR+SPupdate+GenPept+GPupdate,
EDT ^ ' 

96,634 sequences; 27,090,059 total letters. 
Searching done 

                                                                      Smallest
                                                                      Poisson
                                                        Reading  High  Probability
Sequences producing High-scoring Segment Pairs:          Frame Score  P(N)     N

sp|P27615|LIM2_RAT    LYSOSOME MEMBRANE PROTEIN II (L...   +2    114  1.1e-08  1
pir|JQ1523|JQ1523     lysosomal membrane 85K sialogly...   +2    109  6.3e-08  1
sp|P10284|HM26_MOUSE  HOMEOBOX PROTEIN HOX-2.6. >pir|...   -2     61  2.4e-06  2
sp|P16671|CD36_HUMAN  PLATELET GLYCOPROTEIN IV (GPIV)...   +2     94  1.1e-05  1
gp|L06850|HUMCD36B_1  antigen CD36 [Homo sapiens]          +2     94  1.1e-05  1
gp|L19658|RATFAT_1    FAT gene product [Rattus norveg...   +2     92  2.3e-05  1
pir|A43932|A43932     mucin - human (fragment) | 0.0 ...   -1     60  3.8e-05  2
pir|B60492|B60492     homeotic protein Hox B4 - human...   -2     57  4.0e-05  2
sp|Q01200|PRIA_LENED  PRIA PROTEIN. >pir|S23106|S2310...   -1     62  5.6e-05  2
pir|S12968|S12968     Acrosin, sperm - Pig #EC-number...   -2     59  6.7e-05  2
gp|L23108|MUSCDANTI_1 CD36 antigen [Mus musculus]          +2     88  9.0e-05  1
pir|A45106|A45106     mucin - human (fragment) | 0.0 ...   -1     60  9.2e-05  2
pir|S31976|S31976     Cvx peptide - Rat | 0.0 0.0 0.0...   -3     57  0.00012  2
gp|Z16406|MOX2A_1     Mox-2 [Mus musculus]                 -3     57  0.00012  2
gp|Z17223|RNGAXMR_1   Gax peptide [Rattus norvegicus]      -3     57  0.00012  2
sp|P13983|EXTN_TOBAC  EXTENSIN PRECURSOR (CELL WALL H...   -2     56  0.00024  2
pir|G60110|G60110     repetitive protein antigen 69/7...   -2     81  0.00035  1
gp|M14721|MUSFGNAA_1  Mouse epidermal profilaggrin mR...   -3     71  0.0044   1
pir|B36664|B36664     S59/4 homeotic protein - fruit ...   -3     76  0.0080   1

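
Each entry of the hit list carries a database identifier, a truncated description, the reading frame, the high score, the smallest Poisson probability P(N), and N. A minimal parsing sketch follows. It assumes each entry has been joined onto one whitespace-separated line and that the identifier has exactly three pipe-separated parts. `Hit` and `parse_hit_row` are hypothetical names, not part of any BLAST tool.

```python
from typing import NamedTuple


class Hit(NamedTuple):
    database: str      # e.g. "sp", "pir", "gp"
    accession: str
    locus: str
    description: str   # truncated description as printed in the report
    frame: int         # +1..+3 or -1..-3
    score: int         # high score
    p_value: float     # smallest Poisson probability P(N)
    n: int             # number of HSPs contributing to P(N)


def parse_hit_row(row):
    """Parse one single-line hit-list entry into a Hit record."""
    # The last four whitespace-separated fields are always numeric.
    head, frame, score, p_value, n = row.rsplit(None, 4)
    identifier, _, description = head.partition(" ")
    database, accession, locus = identifier.split("|")
    return Hit(database, accession, locus, description.strip(),
               int(frame), int(score), float(p_value), int(n))


if __name__ == "__main__":
    row = "sp|P10284|HM26_MOUSE HOMEOBOX PROTEIN HOX-2.6. >pir|... -2 61 2.4e-06 2"
    print(parse_hit_row(row))
```

Splitting from the right keeps the parser independent of spaces and pipe characters inside the description.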

>sp|P27615|LIM2_RAT LYSOSOME MEMBRANE PROTEIN II (LIMP II) (85 KD LYSOSOMAL
MEMBRANE SIALOGLYCOPROTEIN) (LGP85). >pir|A41180|A41180 74k
lysosomal membrane protein LIMP - rat | 0.0 0.0 0.0 0.0 0.0
>pir|JH0241|JH0241 85K lysosomal membrane sialoglycoprotein - rat |
0.0 0.0 0.0 0.0 0.0 >gp|D10587|RATLGP85_1 LGP85 [Rattus sp.]
>gp|M68965|RATLIMPII_1 lysosomal membrane protein [Rattus
norvegicus]
Length = 478



Plus Strand HSPs:

Score = 114 (55.2 bits), Expect = 1.1e-08, P = 1.1e-08
Identities = 22/64 (34%), Positives = 36/64 (56%), Frame = +2

Query: 254 LLCAVIXSVVMILVMPSL^ 433
LL + +++ V + Q + KN+ + + F W++ P+P Y+ YFF V NP EI
Sbjct: 16 LLVTSVTLLVARVFQKAVDQTIEKNMVL 75

Query: 434 LKGE 445 
L+GE 

Sbjct: 76 LQGE 79 



>pir|JQ1523|JQ1523 lysosomal membrane 85K sialoglycoprotein precursor - human |
0.0 0.0 0.0 0.0 0.0 >gp|D12676|HUMHLGP85_1 85kDa human lysosomal
sialoglycoprotein [Homo sapiens]
Length = 478

Plus Strand HSPs:

Score = 109 (52.8 bits), Expect = 6.3e-08, P = 6.3e-08
Identities = 21/64 (32%), Positives = 35/64 (54%), Frame = +2

Query: 254 LLCAVU5VVMILV>lPSLIKQQVI^NVRIDPSSLSFAMVn^ 433

LL + +++ V +Q + K+ + ++F W++ P+P Y YFF V NP EI 
Sbjct: 16 LLWSVTLLVARVFQKAVDQSIEKKIVLRN 75 

Query: 434 LKGE 445 
L+GE 

Sbjct: 76 LRGE 79 



>sp|P10284|HM26_MOUSE HOMEOBOX PROTEIN HOX-2.6. >pir|A31757|A31757 homeotic
protein Hox 2.6 - mouse | 0.0 0.0 0.0 0.0 0.0 >gp|M36654|MUSHOX26_1
Mouse homeo box 2.6 (Hox-2.6) mRNA, complete cds. [Mus musculus]
Length = 250

Minus Strand HSPs:

Score = 61 (29.7 bits), Expect = 0.72, P = 0.52
Identities = 13/41 (31%), Positives = 19/41 (46%), Frame = -2

Query: 251 PRRPAPPPPSALAVPPMSRVRPGLAGLRDSRGTATEATEAT 129 

P P PPPP + P + V+P L G +EA 
Sbjct: 75 PPPPPPPPPPPPGLSPRAPVQPTAGALLPEPGQRSEAVSSS 115 

Score = 60 (29.3 bits), Expect = 2.4e-06, Poisson P(2) = 2.4e-06 
Identities = 13/25 (52%), Positives = 13/25 (52%), Frame = -2



Query: 278 PHRAQRTAAPRRPAPPPPSALAVPP 204 

P QR AA R P PPPP PP 
Sbjct: 59 PCTVQRYAACRDPGPPPPPPPPPPP 83 



>sp|P16671|CD36_HUMAN PLATELET GLYCOPROTEIN IV (GPIV) (GPIIIB) (CD36 ANTIGEN).
>pir|A30989|A30989 CD36 protein - human | 0.0 0.0 0.0 0.0 0.0
>gp|M24795|HUMANTCD36_1 Human CD36 antigen mRNA, complete cds.
[Homo sapiens]

Plus Strand HSPs:

Identities = 18/64 (28%), Positives = 36/64 (56%), Frame = +2



Query: 425 SEIL 436 
E++ 

Sbjct: 74 QEVM 77 



>gp|L06850|HUMCD36B_1 antigen CD36 [Homo sapiens]
Length = 472

Plus Strand HSPs:

Score = 94 (45.5 bits), Expect = 1.1e-05, P = 1.1e-05
Identities = 18/64 (28%), Positives = 36/64 (56%), Frame = +2

Query: 245 ^ 424
Sbjct: 14 VIGAVIAVFGGILMPVGDLLIQKTIKKQVVLEEGTIAFKNWVKTGTEVTOQFWIF^ 73

Query: 425 SEIL 436
E++
Sbjct: 74 QEVM 77



>gp|L19658|RATFAT_1 FAT gene product [Rattus norvegicus]
Length = 472

Plus Strand HSPs:

Score = 92 (44.5 bits), Expect = 2.3e-05, P = 2.3e-05
Identities = 18/65 (27%), Positives = 36/65 (55%), Frame = +2

Query: 245 ^ 424
Sbjct: 14 VIGAVLAVFGGILMPVGDLLIEKTIKREWLEEGTIAFKNWVKTGTTVTOQFWVFDVQNP 73

Query: 425 SEILK 439
E+ K
Sbjct: 74 EEVAK 78



>pir|A43932|A43932 mucin - human (fragment) | 0.0 0.0 0.0 0.0 0.0
Sgth 4 !^^ 2 ^ 1 mUC±n lHOm ° "Pie-V ^ ^ 



Minus Strand HSPs:

Score = 60 (29.1 bits), Expect = 1.4, P = 0.74
Identities = 12/21 (57%), Positives = 14/21 (66%), Frame = -1

Query: 279 TTPSTAHSSPTTPSPTATQRP 217 

TTPS ++ TTPSPT T P 
Sbjct: 377 TTPSPPPTTMTTPSPTTTPSP 397 

Identities = 12/20 (60%), Positives = 14/20 (70%), Frame = -1

Query: 285 IITTPSTAHSSPTTPSPTAT 226 

I TTPS ++ TTPSPT T 
Sbjct: 343 ITTTPSPPTTTMTTPSPTTT 362 
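
The Identities figures in these ungapped HSPs can be recomputed from the aligned Query and Sbjct strings. The sketch below is a check, not BLAST code: it counts matching positions and truncates the percentage to an integer, which reproduces the figures printed in this report (for example 14/21 is shown as 66%). The function name `identities` is hypothetical. Positives are not computed, because they depend on the scoring matrix.

```python
def identities(query, sbjct):
    """Return (matches, length, percent) for two ungapped aligned segments.

    The percentage is truncated rather than rounded, which matches the
    values printed in this report.
    """
    if len(query) != len(sbjct):
        raise ValueError("aligned segments must have equal length")
    matches = sum(q == s for q, s in zip(query, sbjct))
    return matches, len(query), 100 * matches // len(query)


if __name__ == "__main__":
    # The two mucin HSPs listed above.
    print(identities("TTPSTAHSSPTTPSPTATQRP", "TTPSPPPTTMTTPSPTTTPSP"))
    print(identities("IITTPSTAHSSPTTPSPTAT", "ITTTPSPPTTTMTTPSPTTT"))
```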

>pir|B60492|B60492 homeotic protein Hox B4 - human | 0.0 0.0 0.0 0.0 0.0

Minus Strand HSPs: 

Score = 57 (27.8 bits), Expect = 2.9, P = 0.95
Identities = 12/21 (57%), Positives = 12/21 (57%), Frame = -2

Query: 266 QRTAAPRRPAPPPPSALAVPP 204
QR AA R P PPPP PP
Sbjct: 63 QRYAACRDPGPPPPPPPPPPP 83

Identities = 11/20 (55%), Positives = 12/20 (60%), Frame = -2

Query: 254 APRRPAPPPPSALAVPPMSR 195 

+PR PAPPP AL P R 
Sbjct: 90 SPRAPAPPPAGALLPEPGQR 109 
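
The Frame value of each HSP can be cross-checked against its Query coordinates and the query length (447 letters). The sketch below assumes the convention that frame -1 starts at the last base of the query and that coordinates are given on the forward strand, which is consistent with every HSP legible in this report. `reading_frame` and `residue_count` are hypothetical helper names.

```python
def reading_frame(start, end, query_length):
    """Infer the reading frame from forward-strand query coordinates.

    Plus-strand HSPs have start <= end; minus-strand HSPs have start > end.
    """
    if start <= end:
        return (start - 1) % 3 + 1
    return -((query_length - start) % 3 + 1)


def residue_count(start, end):
    """Number of translated residues spanned by the nucleotide coordinates."""
    return (abs(end - start) + 1) // 3


if __name__ == "__main__":
    QUERY_LENGTH = 447  # "(447 letters)" in the report header
    for start, end in [(254, 433), (251, 129), (279, 217), (266, 204)]:
        print(start, end,
              reading_frame(start, end, QUERY_LENGTH),
              residue_count(start, end))
```

For the HSP at Query 266 to 204 this gives frame -2 and 21 residues, in agreement with the Identities denominator of 21.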



>sp|Q01200|PRIA_LENED PRIA PROTEIN. >pir|S23106|S23106
mushroom | 0.0 0.0 0.0 0.0 0.0 >gp|X60956|LEPRIA_1 priA gene
product [Lentinus edodes]
Length = 258

Minus Strand HSPs:

Score = 62 (30.0 bits), Expect = 0.61, P = 0.46

Identities - 13/31 (41%), Positives J /3l ° # ^ = ^ 

Query: 318 TCCLMSEGITRIITTPSTAHSSPTTPSPTAT 226
TCCL + TPS+AH + T SP++T
Sbjct: 90 TCCLPKWPTSTPTPTPSSAHHTSTHTSPSST 120

Identities = 13/33 (39%), Positives = 16/33 (48%), Frame = -1

Query: 276 TPSTAHSSPTTPSPTATQRPGRAAHVACAARAR 178
TPS+ +TP P+AT G H A AR
Sbjct: 143 TPSSPSKPSSTPKPSATPNKGNGHHYKRAHVAR 175
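
For a single HSP, the P value printed next to Expect follows from Poisson statistics: the probability of at least one chance HSP with that score is P = 1 - exp(-E). The sketch below applies this relation. It is an illustration under that assumption, not the report's own code, and `p_from_expect` is a hypothetical name. Because the report prints Expect to two significant digits, recomputed values can differ from the printed P in the last digit.

```python
import math


def p_from_expect(expect):
    """Poisson probability of at least one chance HSP, given Expect.

    -expm1(-E) equals 1 - exp(-E) but stays accurate for very small E.
    """
    return -math.expm1(-expect)


if __name__ == "__main__":
    # Expect values taken from HSPs in this report.
    for expect in (0.61, 1.1e-08):
        print(expect, p_from_expect(expect))
```

For small Expect values P and Expect are nearly equal, which is why the strongest hits show identical Expect and P figures.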

>pir|S12968|S12968 Acrosin, sperm - Pig #EC-number 3.4.21.10 | 0.0 0.0 0.0 0.0
0.0
Length = 374

Minus Strand HSPs:

Identities = 14/48 (29%), Positives = 24/48 (50%), Frame = -2

l uery: 251 srrsrstr^ ». 

See" sT PflWlB, °^^ 3V3 
S£it"i£ 6.7.-05 

Query: 266 QRTAAPRRPAPPPPSALAVPP 204
Q + PR PAPPPP PP
Sbjct: 294 QPGSRPRPPAPPPPPPPPPPP 314

>gp|L23108|MUSCDANTI_1 CD36 antigen [Mus musculus]
Length = 473

Plus Strand HSPs:

(2«%1, Po 3ltl v es . 35/65 (53%, . FraM , +2 



Query: 425 SEILK 439
++ K
Sbjct: 75 DDVAK 79



